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Abstract. We obtain the phase diagram of random Boolean networks with nested canalizing functions. 
Using the annealed approximation, we obtain the evolution of the number bt of nodes with value one, and 
the network sensitivity A, and we compare with numerical simulations of quenched networks. We find that, 
contrary to what was reported by Kauffman et al. [Proc. Natl. Acad. Sci. 2004 101 49 17102-7], these 
networks have a rich phase diagram, were both the "chaotic" and frozen phases are present, as well as 
an oscillatory regime of the value of b t . We argue that the presence of only the frozen phase in the work 
of Kauffman et al. was due simply to the specific parametrization used, and is not an inherent feature of 
this class of functions. However, these networks are significantly more stable than the variants where all 
possible Boolean functions are allowed. 



^h 1 Introduction 



Boolean networks (BN) were introduced by Kauffman |1I2| 
as a simple model of gene regulation. In this model the 
transcription states of the genes are described by Boolean 
variables, and their dependency to other genes by Boolean 
functions. In Kauffman's original model, which is usu- 
ally called a Random Boolean Network (RBN) [3], both 
the functions and their inputs are randomly distributed 
among all possible choices. Since this clearly discards any 
possible structure which may be selected by the evolu- 
tionary process, these networks are null models of gene 
regulation, from which general features may be derived, 
which are independent of the missing details |4J. Indeed 
this simple model already shows an emergent behaviour 
which may be applicable to real system, namely the ex- 
istence of two dynamical phases: Frozen and "chaotic". In 
the frozen phase, small perturbations have a limited prop- 
agation, and eventually stop. In the chaotic phase, small 
perturbations propagate exponentially fast, often reach- 
ing a finite portion of the system. It has been argued [I] 
that real systems may share features with RBNs which 
lie exactly at the interface between these two phases - 
the so-called critical networks. In this point of the con- 
figuration space, small perturbations propagate only lin- 
early; thus the system retains some stability of the frozen 
phase, as well as some of the excitability of the chaotic 
phase, which may be necessary for the system to respond 
to external signals. Although this a very interesting fea- 
ture, a more plausible comparison with real gene regula- 
tion can only be made if more realistic properties are in- 
cluded, such as more realistic topologies [5 6 7 8I9I1UI1T] 
or update functions |12|13|14*] . for instance. In this paper 



we will consider RBNs with nested canalizing functions 
(NCFs), introduced in |15I16| . These functions are a nat- 
ural extension of the concept of canalization often present 
in biological systems |17I18) . A function with a canalizing 
input is such that if this input is at its canalizing value 
(either 1 or 0) , then the output of the function is automat- 
ically defined, for any combination of the remaining input 
values. It has been observed that many real functions have 
a canalizing input [18 . If this concept is carried out to the 
remaining inputs, such that a hierarchy of canalization is 
present, the resulting function is a nested canalizing func- 
tion. Interestingly, the majority of the real functions stud- 
ied in [TB] are also NCFs [15]. In this work we will obtain 
the phase diagram of RBNs with NCFs. Contrary to what 
was claimed in |16| . such networks possess a rich phase 
diagram, where both the chaotic and frozen phases oc- 
cupy sizable portions of the configuration space. We also 
observe oscillations in the number of values of l's in the 
network, for a portion of the parameter space. 

This paper is divided as follows. In Sec. [2] we define the 
model, as well as nested canalizing functions, and review 
some known facts. In Sec. |3] we obtain the evolution of 
fraction b t of l's, and in Sec. [4] we obtain the network 
sensitivity A and the phase diagrams. We then conclude 
in Sec. [5j and provide some final considerations. 

2 The Model 

A BN is defined as a directed network of N nodes repre- 
senting Boolean variables a £ {1,0}^, which are subject 
to a dynamical update rule, 



a i (t+l) = f i (a(t)) 



(1) 
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where fi is the update function assigned to node i. which 
depends exclusively on the states of its inputs. In this 
work we consider that all nodes are updated in parallel. 
All nodes have the same number of inputs k which are 
chosen randomly. 

Starting from a random configuration, the dynamics of 
the system evolves and eventually settles on an attractor, 
after a transient time. We will characterize the properties 
of the system after this transient time by the fraction bt of 
the number of l's in the network, and by the network sen- 
sitivity A, which will differentiate between the dynamical 
phases. 



2.1 Nested canalizing functions 

As introduced in [T5], a nested canalizing function /({cr,}), 
with inputs {<Ji}, i £ [0..k — 1] , is defined as 
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where Ci £ [0, 1] is the canalizing value of input i and, and 
Si £ [0, 1] is the output value associated with input i. The 
value Sd is the default output of the function, when no 
input is at its canalizing value. It is usually assumed that 
Sd = 1 — Sfe_i, so that every input is sensitive. However, 
in this paper we will also consider the situation where Sd 
is a free parameter. We will call an input with Sj = 1 an 
activator, and a deactivator otherwise; and if an input is 
at its canalizing value, we will say it is canalized. 

We will consider RBNs where the functions are chosen 
randomly from all possible Nested Canalizing functions, 
which are weighted according to the following parameters: 
a, the probability that an input is an activator (i.e. Si = 1), 
c, the probability that the canalizing value of an input is 
1, and d, the probability that the default output is 1. In 
the situation where Sd = 1 — Sk-i this last parameter is 
omitted. 

In [TS] it was shown that many eukaryotic genes seem 
to be regulated by canalizing functions, and in |15| it was 
further identified that all but 6 of the 139 genes studied 
in [TB] are in fact regulated by NCFs. There is a simple 
interpretation for the existence of nested canalizing func- 
tions in real gene regulation networks: Genes are used 
to encode mRNA via a protein called RNA Polymerase 
(RNAP) . This protein binds to a region of the DNA called 
the promoter region, which starts shortly before the gene 
itself. Genes which serve as inputs for other genes encode 
proteins which are called transcription factors (TF) . These 
proteins bind to the upstream region of the gene, i.e. the 
region preceding (and including) the promoter region. The 
presence of a TF close to the promoter region may increase 



or decrease the probability that the RNAP will bind, and 
initiate the transcription. It is easy to imagine a situation 
where an hierarchy of TFs exists, where a TF which binds 
closer to the promoter region has more relevance, and in- 
creases or decreases the binding probability of RNAP by 
such a factor that it overrides the TFs which are bound 
further away, giving rise to a (nested) canalizing input. 

Let us appreciate how much of a deviation is the oc- 
currence of NCFs, in comparison to canalizing functions 
on only one input, as well as to all possible functions, 
by considering the total number of functions belonging to 
each class. Nested canalizing functions are identical to the 
unate cascade functions known in computer science |19j . 
which have optimal properties regarding their computa- 
tion time via binary decision diagrams |20| . and for which 
many properties are known. According to [21_, the number 
of different NCFs with k inputs scales as N nc ~ afc^^ 1 " 1 ^ ln2 ), 
for fc 3> 1, where a is some constant. For comparison, con- 
sider the number of functions which are canalizing on at 
least one input [25], N c ~ 4fc2 2 . Although the frac- 
tion of both these classes relative to the total number 
2 2 of functions with fc inputs vanishes for large fc, using 
the Stirling approximation for fc!, we can easily see that 
the fraction of NCFs is smaller by a factor of N nc /N c ~ 

1/2 2 . Thus the presence of nested canalization is a 
much stronger deviation from a random distribution than 
single-input canalization. 



3 The evolution of number of l's, b t . 

In order to obtain the fraction b t of l's in the network at 
time t, we will employ the annealed approximation [23]. 
This is a mean-field approximation, which assumes that 
the inputs of each function are randomly chosen at each 
time step. By construction, this forbids local correlations 
from arising, which makes the analysis easier. Since a 
quenched disorder should be indistinguishable from an an- 
nealed one, in the limit of large networks, this approxima- 
tion is expected to be exact for large RBNs. 

We begin by defining the probability 7(64) that a ran- 
dom input in the network is at its canalizing value, 



j(b t ) = b t c + (l-b t )(l-c). 



(3) 



Thus, the probability that the output of a randomly cho- 
sen function will be 1 at the next time step is given simply 
by the probability that at least one input is canalized and 
is an activator, or that the default output value is 1, 
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(4) 



The situation where Sd — 1 — Sfc-i can be obtained sim- 
ply by setting d = 1 — a. The equilibrium value b* = b^ 
can then be obtained by solving the above equation for 
b L+ i = b t = b* . We note that lim^oo b* — a, except for 
{a,c) £ {(0, 1), (1,0)}, in which case b* = d. In Fig. [TJ it is 
shown some of the solutions as a function of a, compared 
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Fig. 1. Fraction b* of the number of l's in the network af- 
ter the transient time, as a function of the fraction a of ac- 
tivators. On top, for c = 1 and d = 1, we have b* > only 
for a > a* , where a* — 1/k. On the bottom it is shown the 
bifurcation diagrams, from an oscillatory regime to fixed 
point, for the parameters indicated in the legend. For both 
figures, the symbols represent averages over 10 indepen- 
dent numerical realizations of the dynamics, for networks 
with TV = 10 5 , and the solid lines are solutions of Eq. [I] 



with numerical simulations. There are two interesting be- 
haviors: 1. If c = 1 and d — (or symmetrically if c = 
and d = 1, not shown), there is a second-order transition 
of the value of b* at a = 1/k (or a = 1 — 1/k), below which 
b* = and above which 6* > 0. For other values of c and 
d, this behaviour is replaced by a continuous variation of 
b* (not shown); 2. If \d — c\ is small enough, the value 
of b t shows an oscillatory behaviour after a the transient 
time, between two values of b* . These oscillations hap- 
pen when most of the default output values correspond 
to the canalizing values of a large portion of the inputs, 
and most canalized outputs are different from the canal- 
izing values. In such situation, the canalized inputs tend 
to deactivate themselves, which in turn will increase the 
default activations, and so on. The transition from this 



period-2 oscillation to a fixed point is through a pitch- 
fork bifurcation, where the fixed point becomes unstable 
(\dbt+i/dbt(b*)\ > 1) and gives rise to the oscillation (as 
shown in the bottom of FigfTl). 



4 The network sensitivity, A. 

The response of the system to small perturbations is char- 
acterized by the network sensitivity A [24125] . which is 
defined as k times the probability that the output of ran- 
domly selected function will change if a randomly selected 
input is flipped. Thus, for large networks, the average 
number of nodes flipped after some small time t <C In N 
should be A*. Therefore if A < 1 the number of affected 
nodes will tend to zero after some time, and the dynamics 
is said to be in the frozen phase. If A > 1, the number 
of affected nodes will increase exponentially, and the dy- 
namics is said to be in the "chaotic" phase. For the special 
value of A = 1, the average number of affected nodes in- 
creases linearly with time, and the dynamics is said to be 
in the critical line between the two phases. 

We can obtain the value of A for RBNs with NCFs with 
the annealed approximation, as we did for the values of b* . 
We start by defining the probability 77 that two consecutive 
inputs have the same canalized output, 



77 = a 2 + (1 



(5) 



and the probability rj that the last input has the same 
canalized output as the default output, 



770 = ad + (1 — a)(l — d). 



(6) 



Using this, we can write the probability Ai that the output 
will be flipped if input i is flipped, 



a, = a - 7 (nr [(1 - (i - 7(6*)) fc - i - 1 )(i - v) + 

(l- 7 (6*))*- < - 1 (l-»? )], 



(7) 



which accounts for the probability that all the inputs j < i 
are not canalized, and that the output of i is different than 
any input j > i which may be canalized. The sensitivity 
value A = J^ A, amounts then simply to, 

A = (I-*?) l (1 7J>. (6 * ))fc +k(l- 1 (b*)) k - 1 (v-V°)- (8) 
7(6*) 

The situation where sa = 1 — s^-i can be obtained by 
setting d = 1 — a, as beforehand making the substitution 



rf(l — 6i } k-i) i n Eq. m which yields 
1- (l- 7 (6*))* 



x s = (i-v)- 



7(6* 



(1 - 7 (&*)) fc - 1 [Hv - 1) + (1 - »7°)(* - 1) + 1] • 

(9) 

We note that A s > A for small values of k, but for k ^> 
1 we have X s ~ A, and this value approaches limfc_ i . 00 A = 
(1 — 77)77(6*), for 7(6*) > 0. This means that in the limit of 
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Fig. 2. Network sensitivity A (Eq.M) for several parameter values, as indicated in the labels and axes. The critical line 
A = 1 is indicated by the solid lines. The regions in white correspond to parameter regions where the system displays 
oscillations, and thus the value of A is not well defined. The points marked with stars (•) were verified empirically in 
Fig. [4] 
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Fig. 3. Network sensitivity X s (Eq.KH) for the case were Sd = 1 — Sk-i, for several parameter values, as indicated in the 
labels and axes. The critical line A s = 1 is indicated by the solid lines. The regions in white correspond to parameter 
regions where the system displays oscillations, and thus the value of A s is not well defined. The points marked with 
stars (*) were verified empirically in Fig. El 



large k, NCFs tend to be much more stable than randomly 
chosen functions, which have A = fc/2. 

In Figs. [2] and [3] are the phase diagrams for several 
parameter values. We note that these diagrams are sym- 
metric in respect to the appropriate parameter transfor- 
mations, such as (c, d) -R> (1 — c, 1— d). For d — and c = 1, 
we observe that the critical line is composed of the a = 1/2 
line plus the a = 1/k line, which corresponds to the criti- 
cal values of a where b* > 0, as discussed previously. We 
note that, since A = (1 — v)/l(b*) f° r k 3> 1, there is a 
universal critical line at a — 1/2 for which r\ = 1/2 and 
7(6*) = 1/2, and thus A = 1, independent of c and d. 
This is interesting, since it means that the most entropic 
situation a = c = d = 1/2 leads to a critical network, if 
k is not too small, and either changing c or d leaves the 
value of A unmodified. Thus criticality can be attained for 
a large number of possible configurations, which is maybe 



a reason why this class of functions are favoured biologi- 
cally. However, there are large portion of the configuration 
space where A > 1, and the dynamics finds itself in the 
chaotic phase. Additionally, there are significant regions 
where the fixed point of b* is unstable ans is replaced by 
a period-2 oscillation, as discussed previously. In this sit- 
uation (which are marked in white in Figs |2] and p5j), the 
value of A is not meaningful, since it supposes that b* is 
a stable fixed point, and the networks are neither in the 
frozen nor in the chaotic phase. 

The values of A and A s in Eqs. [8] and [9] can be verified 
empirically, by constructing RBNs and obtaining the so- 
called Derrida plots [26], as shown in Fig. EJ These plots 
show the normalized hamming distance rt+i at time t + 1 
between two identical copies of the network, after only 
one of them had a random fraction of r t nodes flipped at 
time t. As can be seen on the top of Fig. [4] the different 
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Fig. 4. Fraction of perturbed nodes r t+ i after one time 
step, as a function of an initial fraction of random per- 
turbed nodes r t , for parameter values marked as in Figs|2] 
and[3j and shown in the legend. The solid lines are slopes 
of the form r t+1 = Xr t . with A or A s calculated according 
to Eqs(8]or|9] respectively. 

networks (even those with the same value of A) have a 
different perturbation propagation. However, if we only 
consider small values of r t , these curves are matched quite 
exactly by slopes of type Tt+i = Art, a s can be seen on 
the bottom of Fig. |4j 



5 Conclusion 

We have obtained the phase diagram of random Boolean 
networks with nested canalizing functions. Using the an- 
nealed approximation, we have analytically calculated the 
fraction b t of nodes with value one, and the sensitivity A 
of the network to small perturbations. We compared the 
results with numerical realizations of quenched networks, 



which have shown an excellent agreement. We have seen 
that the steady state value b* = lim^oo b t displays two 
interesting features for some parameter combinations: 1. 
For a probability c — 1 that the canalizing value of a ran- 
dom input is one, and a probability d — that the default 
output of a random input is one (and symmetrically for 
d = 1 and c = 0), there is a second-order transition from 
b* = to b* > at critical fraction a = 1/k of inputs 
which are activators; and 2. If \d — c\ is small enough, b t 
will oscillate between two values of b* , up to a value of a 
for which the fixed point of bt will become stable again. A 
similar oscillation is also observed in RBNs with threshold 
functions [27,. 

We have also observed that the phase diagram has 
large regions where A > 1, and thus the system is in 
the chaotic phase. This contradicts the claim by Kauff- 
man et al. |16| that this class of functions always leads 
to A < 1. The results in [T§] were obtained for specific 
in-degree distributions, and a selection of the canalizing 
values according to specific probabilities. These probabili- 
ties were chosen so that they match the experimental data 
they analysed, but for which no other explanation was of- 
fered. This distribution was then extrapolated to obtain 
the entire phase diagram, which resulted only in values of 
A < 1. In this work, the functions were chosen with differ- 
ent probabilities, according to the simple parameters a, c 
and d, which resulted in a phase diagram with large por- 
tions where A > 1. Therefore, since there seems to be no 
reason to believe the specific parametrization in |16| has a 
general character, it should not be concluded that the exis- 
tence of NCFs always leads to A < 1. On the other hand, 
the values of A for RBNs with NCFs are much smaller 
than RBNs with all functions chosen with equal probabil- 
ity. Thus the conclusion in [TS] that NCFs convey more 
stability to the system seem well justified, and it may in- 
deed be a reason why these functions are observed in real 
biological systems. 

This work has been supported by the DFG under Con- 
tract No. Dr300/5-l. 
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